2. Context: Prologin
● French national programming contest for students under 20
● Online qualification with algorithmic exercises
● Thousands of applications every year
● C, C++, C#, Python, Haskell, OCaml, Java, PHP, …
https://prologin.org
3. Problem: secure untrusted code evaluation
(We are lazy and we want to grade our students without looking at their code.)
Aimed at: teachers, programming contests, learning websites…
Design goals:
● Simple enough to be used by everyone (teachers, developers, tinkerers…)
● Fast and precise (overhead matters in programming contests)
● Secure enough to be exposed on public websites (or to malicious students)
● Abstract the languages in a modular way
11. Solutions considered that don’t really work:
● ptrace
○ Overhead to monitor the system calls
○ Multiprocessing doesn’t work
○ Not multiplatform
○ Lots of things to handle
○ Runtimes can do weird things
● Docker
○ Overhead because overkill
○ Not precise enough
12. Isolation backend
Backends:
● “Big brother” (chroot + setrlimit + memory watchdog + outside firewall)
○ Previous in-house solution
○ Isolation is very sloppy
● Isolate (https://github.com/ioi/isolate)
○ Resource limits using cgroups
○ Isolation with namespaces
○ Lightweight FS isolation (chroot + mount --bind)
● Nsjail? (http://nsjail.com/)
○ Could be implemented as an alternate backend
○ You know how every time you do something, Google comes and does it 10x better?
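To make the setrlimit part of the old "Big brother" approach concrete, here is a minimal sketch, assuming a Linux host. It is not Prologin's code: run_limited and its cpu_seconds parameter are hypothetical names, and it only covers CPU time and core dumps (no chroot, no memory watchdog, no firewall).

```python
import resource
import subprocess
import sys


def run_limited(cmd, cpu_seconds=2, stdin_data=b''):
    """Run cmd with setrlimit-based limits applied in the child before exec.

    Hypothetical helper: illustrates the mechanism only, not a full sandbox.
    """
    def apply_limits():
        # Executed in the forked child, just before exec(): the limits
        # apply to the untrusted program, not to the grader itself.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
        # Disable core dumps so a crashing program cannot fill the disk.
        resource.setrlimit(resource.RLIMIT_CORE, (0, 0))

    return subprocess.run(cmd, input=stdin_data,
                          stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                          preexec_fn=apply_limits)


if __name__ == '__main__':
    # The child reports the soft CPU limit it inherited.
    probe = ('import resource; '
             'print(resource.getrlimit(resource.RLIMIT_CPU)[0])')
    result = run_limited([sys.executable, '-c', probe])
    print(result.stdout.decode().strip())
```

Limits set this way are per-process, which is one reason such a setup stays sloppy: a program that forks gets a fresh budget in each child, whereas cgroups account for the whole group.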
13. Language module system
Python 3.6 __init_subclass__ in action!
from camisole.models import Lang, Program
class Python(Lang, name='Python'):
    source_ext = '.py'
    interpreter = Program('python3')
    reference_source = r'print(42)'
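How can a class statement alone register a language? A minimal sketch of the __init_subclass__ mechanism, assuming a simple dictionary registry. This Lang is a hypothetical stand-in, not camisole's actual base class.

```python
class Lang:
    """Hypothetical base class: every subclass registers itself by name."""
    registry = {}

    def __init_subclass__(cls, name=None, **kwargs):
        # Called automatically when a subclass is defined (Python 3.6+).
        # Keyword arguments of the class statement (name='Python') land here.
        super().__init_subclass__(**kwargs)
        cls.name = name or cls.__name__
        Lang.registry[cls.name.lower()] = cls


class Python(Lang, name='Python'):
    source_ext = '.py'


class Java(Lang):  # no explicit name: falls back to the class name
    source_ext = '.java'


if __name__ == '__main__':
    print(sorted(Lang.registry))
```

No metaclass and no decorator are needed: defining the subclass is enough to make it discoverable.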
Load arbitrary language modules with:
$ export CAMISOLEPATH=~/mylangs
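The slides do not show how CAMISOLEPATH is resolved. One plausible sketch, assuming the variable holds directories separated like PATH and that every *.py file in them is a language module (load_modules_from_env is a hypothetical name):

```python
import importlib.util
import os
from pathlib import Path


def load_modules_from_env(var='CAMISOLEPATH'):
    """Import every *.py file found in the directories listed in $var.

    Assumption: directories are separated by os.pathsep, as in $PATH.
    """
    modules = []
    for directory in filter(None, os.environ.get(var, '').split(os.pathsep)):
        for source in sorted(Path(directory).expanduser().glob('*.py')):
            spec = importlib.util.spec_from_file_location(source.stem, source)
            module = importlib.util.module_from_spec(spec)
            # Executing the module runs its class statements, which is all
            # a self-registering language class needs.
            spec.loader.exec_module(module)
            modules.append(module)
    return modules
```

Importing is enough because language classes register themselves when their class statement runs.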
14. (Simple, except for Java.)
import re
import subprocess
from pathlib import Path
from camisole.models import Lang, Program
RE_WRONG_FILENAME_ERROR = re.compile(r...,')
PSVMAIN_SIGNATURE = 'public static void main('
PSVMAIN_DESCRIPTOR = 'descriptor: ([Ljava/lang/String;)V'
class Java(Lang):
    source_ext = '.java'
    compiled_ext = '.class'
    compiler = Program('javac', env={'LANG': 'C'},
                       version_opt='-version')
    interpreter = Program('java', version_opt='-version')
    # /usr/lib/jvm/java-8-openjdk/jre/lib/amd64/jvm.cfg links to
    # /etc/java-8-openjdk/amd64/jvm.cfg
    allowed_dirs = ['/etc/java-8-openjdk']
    # ensure we can parse the javac(1) stderr
    extra_binaries = {'disassembler': Program('javap',
                                              version_opt='-version')}
    reference_source = r'''
class SomeClass {
    static int fortytwo() {
        return 42;
    }
    static class Subclass {
        // nested psvmain! wow!
        public static void main(String args[]) {
            System.out.println(SomeClass.fortytwo());
        }
    }
}
'''

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # use an illegal class name so that javac(1) will spit out the actual
        # class name used in the source
        self.class_name = '1337'
        # we give priority to the public class, if any, so keep a flag if we
        # found such a public class
        self.found_public = False
        try:
            self.heapsize = self.opts['execute'].pop('mem')
        except KeyError:
            self.heapsize = None

    def compile_opt_out(self, output):
        # javac has no output directive, file name is class name
        return []

    async def compile(self):
        # first try with the current class name (initially the illegal '1337')
        retcode, info, binary = await super().compile()
        if retcode != 0:
            # error: public class name is not '1337' -- obviously, it's
            # illegal, so find what it actually is
            match = RE_WRONG_FILENAME_ERROR.search(info['stderr'])
            if match:
                self.found_public = True
                self.class_name = match.group(1)
                # retry with new name
                retcode, info, binary = await super().compile()
        return (retcode, info, binary)

    def source_filename(self):
        return self.class_name + self.source_ext

    def execute_filename(self):
        # return eg. Main.class
        return self.class_name + self.compiled_ext

    def execute_command(self, output):
        cmd = [self.interpreter.cmd]
        # Use the memory limit as a maximum heap size
        if self.heapsize is not None:
            cmd.append(f'-Xmx{self.heapsize}k')
        # foo/Bar.class is run with $ java -cp foo Bar
        cmd += ['-cp',
                str(Path(self.filter_box_prefix(output)).parent),
                self.class_name]
        return cmd

    def find_class_having_main(self, classes):
        for file in classes:
            # run javap(1) with type signatures
            try:
                stdout = subprocess.check_output(
                    [self.extra_binaries['disassembler'].cmd, '-s',
                     str(file)],
                    stderr=subprocess.DEVNULL,
                    env=self.compiler.env)
            except subprocess.SubprocessError:
                continue
            # iterate on lines to find p s v main() signature and then
            # its descriptor on the line below; we don't rely on the type
            # from the signature, because it could be String[], String... or
            # some other syntax I'm not even aware of
            lines = iter(stdout.decode().split('\n'))
            for line in lines:
                if line.lstrip().startswith(PSVMAIN_SIGNATURE):
                    # default of '' avoids StopIteration on a truncated dump
                    if next(lines, '').lstrip() == PSVMAIN_DESCRIPTOR:
                        return file.stem

    def read_compiled(self, path, isolator):
        # in case of multiple or nested classes, multiple .class files are
        # generated by javac
        classes = list(isolator.path.glob('*.class'))
        files = [(file.name, file.read_bytes()) for file in classes]
        if not self.found_public:
            # the main() may be anywhere, so run javap(1) on all .class
            new_class_name = self.find_class_having_main(classes)
            if new_class_name:
                self.class_name = new_class_name
        return files

    def write_binary(self, path, binary):
        # see read_compiled(), we need to write back all .class files
        # but give only the main class name (execute_filename()) to java(1)
        for file, data in binary:
            with (path / file).open('wb') as c:
                c.write(data)
        return path / self.execute_filename()
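The main()-detection step can be tried without a JVM. The sketch below applies the same line-pair matching to hand-written text that imitates `javap -s` output. The samples are assumptions typed for illustration, not captured from a real run, and has_main is a hypothetical helper.

```python
PSVMAIN_SIGNATURE = 'public static void main('
PSVMAIN_DESCRIPTOR = 'descriptor: ([Ljava/lang/String;)V'

# Hand-written sample shaped like `javap -s` output (illustrative only).
SAMPLE_WITH_MAIN = '''\
Compiled from "SomeClass.java"
class SomeClass$Subclass {
  SomeClass$Subclass();
    descriptor: ()V

  public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
}
'''

# Same shape, but main() takes an int: right name, wrong descriptor.
SAMPLE_WRONG_MAIN = '''\
Compiled from "Other.java"
class Other {
  public static void main(int);
    descriptor: (I)V
}
'''


def has_main(javap_output):
    """Return True if the dump declares `public static void main(String[])`.

    The signature line only locates the method; the decision is made on
    the descriptor line right below it, which is the same however the
    parameter was spelled in the source (String[], String..., ...).
    """
    lines = iter(javap_output.split('\n'))
    for line in lines:
        if line.lstrip().startswith(PSVMAIN_SIGNATURE):
            if next(lines, '').lstrip() == PSVMAIN_DESCRIPTOR:
                return True
    return False


if __name__ == '__main__':
    print(has_main(SAMPLE_WITH_MAIN), has_main(SAMPLE_WRONG_MAIN))
```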
15. Low-level API
When simple single-file evaluation doesn’t suit your needs:
opts = {'time': 5, 'mem': 5000}
isolator = Isolator(opts, allowed_dirs=['/home'])
async with isolator:
    await isolator.run(command, env=env, data=input())
return (isolator.stdout, isolator.stderr)
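For readers who want to run the shape of this API without camisole installed, here is a toy stand-in. ToyIsolator is hypothetical and provides NO isolation at all: it only mimics the async context manager pattern with a plain asyncio subprocess and a wall-clock timeout taken from opts['time'].

```python
import asyncio
import sys


class ToyIsolator:
    """Hypothetical stand-in for an isolation backend. NOT a sandbox."""

    def __init__(self, opts):
        self.opts = opts
        self.stdout = self.stderr = b''
        self.retcode = None

    async def __aenter__(self):
        # A real backend would create its sandbox here.
        return self

    async def __aexit__(self, exc_type, exc, tb):
        # ... and tear it down here, even if run() raised.
        return False

    async def run(self, command, env=None, data=b''):
        proc = await asyncio.create_subprocess_exec(
            *command, env=env,
            stdin=asyncio.subprocess.PIPE,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE)
        try:
            self.stdout, self.stderr = await asyncio.wait_for(
                proc.communicate(data), timeout=self.opts.get('time'))
        except asyncio.TimeoutError:
            # Do not leave a runaway child behind.
            proc.kill()
            await proc.wait()
            raise
        self.retcode = proc.returncode


async def evaluate(command, data):
    isolator = ToyIsolator({'time': 5})
    async with isolator:
        await isolator.run(command, data=data)
    return (isolator.stdout, isolator.stderr)


if __name__ == '__main__':
    code = 'import sys; print(sys.stdin.read().upper())'
    print(asyncio.run(evaluate([sys.executable, '-c', code], b'hi')))
```

The context manager is what guarantees cleanup: whatever happens inside the block, the teardown code in __aexit__ still runs.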
16. Deployment
We autobuild an OVA (VirtualBox export) using packer.io:
https://camisole.prologin.org/ova/camisole-latest.ova
Importing it in VirtualBox and running the VM just works™ and gives you an HTTP
server with all the built-in languages (Ada, C, Brainfuck, C#, C++, F#, Haskell, Java,
JavaScript, Lua, OCaml, Pascal, Perl, PHP, Python, Ruby, Rust, Scheme, Visual Basic).
Great for non-tech-savvy people!
17. Conclusion
● Elegant API for a hard problem: good abstraction!
● Linux isolation is awesome
● Python 3.5 and 3.6 features are awesome (f-strings, __init_subclass__, async…)
Will our simplicity-centered design make the project gain traction? :-)
Full documentation: https://camisole.prologin.org
Contribute! https://github.com/prologin/camisole
Contact: #prologin @ irc.freenode.net
antoine.pietri@prologin.org alexandre.macabies@prologin.org